On the derivatives of the sigmoid

Authors

  • Ali A. Minai
  • Ronald D. Williams
Abstract

The sigmoid function is very widely used as a neuron activation function in artificial neural networks, which makes its attributes a matter of some interest. This paper presents some general results on the derivatives of the sigmoid. These results relate the coefficients of various derivatives to standard number sequences from combinatorial theory, and thus provide a standard, efficient way of calculating these derivatives. Such derivatives can be used in approximating the effect of input perturbations on the output of single neurons and could also be useful in statistical modeling applications.

Keywords: Neuron activation function, Sigmoid function, Logistic function, Eulerian numbers, Stirling numbers.

Acknowledgements: This research was supported by the Center for Semicustom Integrated Systems at the University of Virginia and the Virginia Center for Innovative Technology. We would like to thank Prof. Worthy N. Martin and Prof. Bruce A. Chartres for their valuable suggestions. Dr. Minai would also like to thank the Department of Neurosurgery, University of Virginia, and especially Prof. William B. Levy for their support. The paper has also benefited considerably from the suggestions of two anonymous reviewers. Requests for reprints should be sent to Dr. Ali A. Minai, Department of Neurosurgery, Box 420, Health Sciences Center, University of Virginia, Charlottesville, VA 22908.

1. INTRODUCTION

The sigmoid function has found extensive use as a nonlinear activation function for neurons in artificial neural networks. Thus, it is of some interest to explore its characteristics. In this paper, we study the derivatives of the 1-dimensional sigmoid function

$$y = \sigma(x; w) = \frac{1}{1 + e^{-wx}}, \tag{1}$$

where $x \in \mathbb{R}$ is the independent variable and $w \in \mathbb{R}$ is a weight parameter. In particular, we present recurrence relations for calculating derivatives of any order, and show that the coefficients generated by these recurrences are directly related to standard number theoretic sequences. Thus, our formulation can be used to determine higher derivatives of the sigmoid directly from standard tables.

2. MOTIVATION

The sigmoid in eqn (1) is also called the logistic function, and can be seen as representing a "neuron" with one input. However, it can also be given a more general interpretation. Let $i$ be a neuron with $n$ inputs $x_j$, $1 \le j \le n$, and let the output $y$ of the neuron be given by:

$$y = \frac{1}{1 + e^{-z}}, \tag{2.1}$$

$$z = \sum_{j} w_{ij} x_j + \theta = \mathbf{w} \cdot \mathbf{x} + \theta, \tag{2.2}$$

where $\theta$ is a bias value, $\mathbf{x} = [x_1, x_2, \ldots, x_n]$ and $\mathbf{w} = [w_{i1}, w_{i2}, \ldots, w_{in}]$. This defines a sigmoidal step in $n + 1$ dimensions (henceforth called an n-dimensional sigmoid), oriented in the direction of the n-dimensional vector $\mathbf{w}$ (Figure 1).

The use of this sigmoid function in neural networks derives in part from its utility in Bayesian estimation of classification probabilities. It arises as the expression for the posterior probability when the weights are interpreted as logs of prior probability ratios (Stolorz et al., 1992). The sigmoid is also attractive as an activation function because of its monotonicity and its simple form. The form of its lower derivatives also makes it attractive for learning algorithms like back propagation (Werbos, 1974).

It is clear from eqns (2.1, 2.2) that the value of $y$ remains unchanged along n-dimensional hyperplanes orthogonal to $\mathbf{w}$. If $\theta = 0$, the n-dimensional hyperplane $z = 0$ ($y = 0.5$) passes through the origin $\mathbf{x} = 0$. If $\theta$ is not 0, the sigmoid is shifted along $\mathbf{w}$ by a distance $-\theta / \|\mathbf{w}\|$, where $\|\cdot\|$ is the Euclidean norm. The shape of the sigmoid (specifically, the sharpness of its nonlinearity) is determined by $\|\mathbf{w}\|$, with a larger value giving a sharper sigmoid. Figure 2 shows a 1-dimensional sigmoid with different weight values. It can be shown quite easily (Minai, 1992) that trac-
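
The neuron of eqns (2.1, 2.2) and the derivatives of the sigmoid can be sketched in Python as follows. This is a minimal illustration, not the paper's own formulation: it relies only on the standard identity dy/dz = y(1 - y), which makes every higher derivative a polynomial in y. The function names `neuron_output`, `derivative_polynomials`, and `sigmoid_derivative` are hypothetical, introduced here for illustration.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(x, w, theta=0.0):
    """n-input neuron of eqns (2.1, 2.2): y = sigmoid(w . x + theta)."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + theta
    return sigmoid(z)


def derivative_polynomials(order):
    """Return coefficient lists P_0..P_order with d^n y / dz^n = P_n(y).

    P_n[k] is the coefficient of y**k. Because dy/dz = y * (1 - y),
    the chain rule gives P_{n+1}(y) = P_n'(y) * (y - y**2).
    (Assumed illustrative recurrence, not quoted from the paper.)
    """
    polys = [[0, 1]]  # P_0(y) = y
    for _ in range(order):
        p = polys[-1]
        nxt = [0] * (len(p) + 1)
        for k in range(1, len(p)):
            # k*c_k*y**(k-1) * (y - y**2) = k*c_k*y**k - k*c_k*y**(k+1)
            nxt[k] += k * p[k]
            nxt[k + 1] -= k * p[k]
        polys.append(nxt)
    return polys


def sigmoid_derivative(x, w, n):
    """n-th derivative with respect to x of sigma(x; w) = sigmoid(w * x)."""
    y = sigmoid(w * x)
    coeffs = derivative_polynomials(n)[n]
    # Each differentiation with respect to x contributes one factor of w.
    return (w ** n) * sum(c * y ** k for k, c in enumerate(coeffs))
```

For example, `derivative_polynomials(3)[3]` gives the coefficients of y - 7y^2 + 12y^3 - 6y^4. The magnitudes 1, 7, 12, 6 equal (k - 1)! S(4, k), with S the Stirling numbers of the second kind, which is consistent with the link to combinatorial sequences described in the abstract.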


Similar resources

Vibration analysis of FGM cylindrical shells under various boundary conditions

In this paper, a unified analytical approach is proposed to investigate vibrational behavior of functionally graded shells. Theoretical formulation is established based on Sanders’ thin shell theory. The modal forms are assumed to have the axial dependency in the form of Fourier series whose derivatives are legitimized using Stokes transformation. Material properties are assumed to be graded in...


Fekete-Szegő problems for analytic functions in the space of logistic sigmoid functions based on quasi-subordination

In this paper, we define new subclasses $S^{*}_{q}(\alpha,\Phi)$, $M_{q}(\alpha,\Phi)$ and $L_{q}(\alpha,\Phi)$ of analytic functions in the space of logistic sigmoid functions based on quasi-subordination and determine the initial coefficient estimates $|a_2|$ and $|a_3|$ and also determine the relevant connection to the classical Fekete-Szegő inequalities. Further, we discuss the improved ...


Idiopathic Perforation of the Sigmoid Colon in a 2.5 Years Old Girl: A Case Report

Idiopathic colon perforation is rare in children. It is more common at the extremes of age. The splenic flexure, ileocecal and lower sigmoid regions are the most common sites of perforation. Delay in proper management of this condition is associated with high mortality and morbidity rates. We report on the case of a 2.5-year-old girl who presented with fever, diarrhea, nausea and vomiting and...


The Effect of Scattering from Leg Region on Organ Doses in Prostate Brachytherapy for 103Pd, 125I and 131Cs Seeds

Introduction: Dose calculation of the tumor and surrounding tissues is essential during prostate brachytherapy. Three radioisotopes, namely, 125I, 103Pd, and 131Cs, are extensively used in this method. In this study, we aimed to calculate the doses received by the prostate and critical organs using the aforementioned radioactive seeds and to investigate the effect of scattering contribution for the ...


Coefficient bounds for a new class of univalent functions involving Salagean operator and the modified Sigmoid function

We define a new subclass of univalent functions based on the Salagean differential operator and obtain the initial Taylor coefficients using the techniques of Briot-Bouquet differential subordination in association with the modified sigmoid function. Further, we obtain the classical Fekete-Szegő inequality results.


Giant Submucosal Lipoma of Sigmoid Colon

Lipomas of the gastrointestinal tract are relatively uncommon in clinical practice. Most cases are asymptomatic, with small tumor size, and do not need any special treatment, but the large ones are known to cause symptoms such as abdominal pain, obstruction, intussusception, and bleeding. The majority (90%) of these lesions are submucosal with predominantly right sided with a slight preponder...



Journal:
  • Neural Networks

Volume 6, Issue

Pages -

Publication date: 1993